SUMMARY

ESTIMATION OF A LINEAR TRANSFORMATION: LARGE SAMPLE RESULTS

The present paper provides large sample strong consistency and
distributional results for the maximum likelihood estimators $\hat{B}$ and
$\hat{\sigma}^2$ of the regression slope matrix $B$ and error variance in the
multivariate "errors in variables" regression model introduced by Gleser and
Watson (1973), and generalized by A. K. Bhargava (1975). In Bhargava's model,
$n$ independent observations $x_i' = (x_{1i}', x_{2i}')$ are taken on pairs of
random vectors $x_{1i}$: $p \times 1$ and $x_{2i}$: $r \times 1$, $r \le p$.
It is assumed that for each $i = 1, 2, \ldots, n$,
$\mathcal{E}(x_{2i}) = B\,\mathcal{E}(x_{1i})$ and that $x_i$ has a
$(p+r)$-variate normal distribution with covariance matrix
$\sigma^2 I_{p+r}$. We wish to estimate $B$, $\sigma^2$, and
$\mathcal{E}(x_{1i})$, $i = 1, 2, \ldots, n$. Under a reasonable assumption
concerning the sequence $\{\mathcal{E}(x_{1i})\}$, we show that $\hat{B}$ and
$r^{-1}(p+r)\hat{\sigma}^2$ are strongly consistent estimators of $B$ and
$\sigma^2$, respectively, as $n \to \infty$. We also obtain the limiting
distributions of $n^{1/2}(\hat{B} - B)$ and
$n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2)$. Using these asymptotic
distributions, approximate confidence region procedures for estimating $B$
and $\sigma^2$ are suggested.
In  the  course  of  our  derivations,  we  establish  large  sample  strong  convergence 
and  distributional  results  for  the  noncentral  Wishart  distribution. 
ESTIMATION OF A LINEAR TRANSFORMATION:

LARGE  SAMPLE  RESULTS1 

by 

Leon  Jay  Gleser 
Purdue  University 

1.  Introduction.  It  is  well  known  that  the  presence  of  errors  of 
measurement  in  the  independent  variables  in  univariate  linear  regression 
(i.e., one dependent variable) makes the ordinary least squares estimators
inconsistent  and  biased.  Models  of  regression  which  incorporate  "errors 
in  variables"  have  been  studied,  and  an  extensive  literature  exists  which 
deals  with  maximum  likelihood  and  generalized  least  squares  estimators  of 
the  parameters  of  univariate  "errors  in  variables"  regression  models 
[Madansky  (1959),  Moran  (1971),  Sprent  (1966),  Williams  (1955)].  Less  is 
known  concerning  the  estimation  of  the  parameters  in  multivariate  "errors 
in  variables"  regression  models,  although  Gleser  and  Watson  (1973)  have 
considered  maximum  likelihood  estimators  (MLE)  of  the  parameters  in  a 
multivariate  "errors  in  variables"  regression  model  in  which  the  number  of 
dependent  variables  equals  the  number  of  independent  variables.  Recently, 
A.  K.  Bhargava  (1975)  has  found  the  MLE  of  the  parameters  in  a multivariate 
"errors  in  variables"  regression  model  in  which  the  number  r of  dependent 
variables is no greater than the number p of independent variables ($r \le p$).

It  should  be  noted  that  many  of  the  papers  dealing  with  "errors  in 
variables"  regression  models  speak  instead  of  "estimating  linear  functional 
relationships"  or,  in  the  case  of  Gleser  and  Watson  (1973)  and  Bhargava 
(1975),  of  "estimating  linear  transformations".  Because  the  present  paper 
is concerned with the model discussed by Bhargava, we have adopted his
terminology for the sake of continuity. The references at the end of this
paper,  particularly  Moran  (1971),  should  be  sufficient  for  the  reader  to 
track  down  related  papers. 

The model which we adopt in the present paper is the following. We
observe $n$ independent pairs of random vectors $x_i' = (x_{1i}', x_{2i}')$,
where $x_{1i}$ is $p \times 1$ and $x_{2i}$ is $r \times 1$, $r \le p$,
$i = 1, 2, \ldots, n$. We assume that

(1.1)    $x_i = \begin{pmatrix} x_{1i} \\ x_{2i} \end{pmatrix} = \begin{pmatrix} \xi_{1i} \\ \xi_{2i} \end{pmatrix} + \begin{pmatrix} e_{1i} \\ e_{2i} \end{pmatrix} = \xi_i + e_i$,

where

(1.2)    $\xi_{2i} = B\,\xi_{1i}$,

$i = 1, 2, \ldots, n$. We also assume that the vectors $e_i$,
$i = 1, 2, \ldots, n$, are i.i.d. with

(1.3)    $\mathcal{E}(e_i) = 0$,    $\mathcal{E}(e_i e_i') = \sigma^2 I_{p+r}$,

$i = 1, 2, \ldots, n$. For the purpose of inference, the common distribution
of the $e_i$'s is assumed to be multivariate normal. The parameters $B$:
$r \times p$, $\sigma^2 > 0$, and $\xi_{1i}$: $p \times 1$,
$i = 1, 2, \ldots, n$, are assumed to be unknown, and are to be estimated.

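Now let us adopt a more compact notation. Let

    $X = (x_1, \ldots, x_n) = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$,    $\Xi = (\xi_1, \ldots, \xi_n) = \begin{pmatrix} \Xi_1 \\ \Xi_2 \end{pmatrix}$,

and

    $E = (e_1, \ldots, e_n) = \begin{pmatrix} E_1 \\ E_2 \end{pmatrix}$,

where $X_1$, $\Xi_1$, $E_1$ are $p \times n$; $X_2$, $\Xi_2$, $E_2$ are
$r \times n$; and $r \le p$. In terms of these matrices, our model becomes

(1.4)    $X = \Xi + E$,

(1.5)    $\Xi_2 = B\,\Xi_1$,

where the columns of $E$ are i.i.d. with mean vector 0 and covariance matrix
$\sigma^2 I_{p+r}$.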

Let

(1.6)    $W = XX'$.

Let $d_1 \ge d_2 \ge \cdots \ge d_p \ge d_{p+1} \ge \cdots \ge d_{p+r} \ge 0$
be the (ordered) eigenvalues of $W$, and let

(1.7)    $D = \begin{pmatrix} D_{\max} & 0 \\ 0 & D_{\min} \end{pmatrix} = \operatorname{diag}(d_1, d_2, \ldots, d_{p+r})$,

where $D_{\max} = \operatorname{diag}(d_1, d_2, \ldots, d_p)$ and
$D_{\min} = \operatorname{diag}(d_{p+1}, \ldots, d_{p+r})$.

Finally, let

    $G = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}$:  $(p+r) \times (p+r)$,    $G_{11}$: $p \times p$,

satisfy

(1.8)    $G'G = GG' = I_{p+r}$,

(1.9)    $W = GDG'$.

That is, $G$ is an orthogonal matrix whose $i$th column is the eigenvector
corresponding to $d_i$, $i = 1, 2, \ldots, p+r$.
Theorem 1.1 [Bhargava (1975)]. Under the assumption that $n \ge p+r$ and that
the common distribution of the columns of $E$ is multivariate normal, the MLE
of $B$, $\Xi_1$, and $\sigma^2$ are respectively:

(1.10)    $\hat{B} = G_{21} G_{11}^{-1} = -(G_{22}')^{-1} G_{12}'$,

(1.11)    $\hat{\Xi}_1 = G_{11} G_{11}' X_1 + G_{11} G_{21}' X_2 = (I_p - G_{12} G_{12}') X_1 - G_{12} G_{22}' X_2$,

and

(1.12)    $\hat{\sigma}^2 = n^{-1} (p+r)^{-1} \operatorname{tr} D_{\min}$.
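As an illustration of how (1.10)-(1.12) might be evaluated numerically, the following is a minimal sketch using NumPy's symmetric eigensolver. The function name `mle_linear_transformation`, the convention that the first p rows of X hold $X_1$, and the simulated data are assumptions made for this example; they are not part of the original paper.

```python
import numpy as np

def mle_linear_transformation(X, p):
    """Sketch of (1.10)-(1.12); X is (p+r) x n with X_1 in its first p rows."""
    m, n = X.shape
    r = m - p
    W = X @ X.T                                        # (1.6)
    d, G = np.linalg.eigh(W)                           # eigenvalues in ascending order
    d, G = d[::-1], G[:, ::-1]                         # reorder so d_1 >= ... >= d_{p+r}
    G11, G21 = G[:p, :p], G[p:, :p]
    B_hat = G21 @ np.linalg.inv(G11)                   # (1.10); assumes G11 is nonsingular
    Xi1_hat = G11 @ (G11.T @ X[:p] + G21.T @ X[p:])    # (1.11)
    sigma2_hat = d[p:].sum() / (n * (p + r))           # (1.12)
    return B_hat, Xi1_hat, sigma2_hat

# Hypothetical simulated data: p = 2, r = 1, error standard deviation 0.1.
rng = np.random.default_rng(0)
B = np.array([[1.0, -0.5]])
Xi1 = rng.normal(size=(2, 2000))
X = np.vstack([Xi1, B @ Xi1]) + 0.1 * rng.normal(size=(3, 2000))
B_hat, Xi1_hat, sigma2_hat = mle_linear_transformation(X, p=2)
```

Note that `B_hat` does not depend on the sign or ordering conventions the eigensolver uses within the leading p eigenvectors, since $G_{21} G_{11}^{-1}$ is unchanged when the first p columns of G are multiplied on the right by any nonsingular matrix.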


In Section 2, we show that $\hat{B}$ and $r^{-1}(p+r)\hat{\sigma}^2$ are
sequences of strongly consistent estimators of $B$ and $\sigma^2$,
respectively. Our results are obtained
without  assuming  that  the  common  distribution  of  the  columns  of  E is  the 
multivariate  normal  distribution.  All  that  is  needed  is  that 


(1.13)    $A = \lim_{n \to \infty} n^{-1} \Xi_1 \Xi_1'$

exists and, in the case of the strong convergence of $\hat{B}$, is positive
definite. Interestingly, $\{n^{-1} \hat{\Xi}_1 \hat{\Xi}_1'\}$ is not a
sequence of consistent estimators of $A$.

However,  at  least  one  sequence  of  strongly  consistent  estimators  of  A does 
exist,  as  we  show  in  Section  2. 

Section 3 considers large sample distributional results for
$n^{1/2}(\hat{B} - B)$ and $n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2)$.
If the elements of any column of $E$ have finite fourth moments, both
$n^{1/2}(\hat{B} - B)$ and $n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2)$
are asymptotically normal. The covariance matrix of the asymptotic
multivariate normal distribution of $n^{1/2}(\hat{B} - B)$ is determined in
the special case when the columns of $E$ have a common multivariate normal
distribution; and a strongly consistent sequence of estimators of this
covariance matrix is established. These results lead to an approximate large
sample $100(1-\alpha)\%$ elliptical confidence region for $B$. In Section 3,
we also obtain an approximate large sample $100(1-\alpha)\%$ confidence
interval for $\sigma^2$.

The  method  of  proof  that  we  use  in  Sections  2 and  3 (particularly 
Section  3)  makes  use  of  explicit  representations  of  our  estimators  in  terms 
of  weighted  matrix  sums  of  the  elements  of  W,  and  thus  requires  us  to 
establish  strong  consistency  and  large  sample  distributional  results  for 
the  elements  of  this  matrix.  We  do  this  both  under  general  assumptions 
about the common distribution of the columns $e_1, e_2, \ldots$ of $E$, and under the
particular  assumption  that  the  columns  of  E have  a common  multivariate 
normal  distribution.  In  the  latter  case,  W has  a noncentral  Wishart 
distribution,  and  our  results  in  Sections  2 and  3 provide  large  sample 
strong  consistency  and  distributional  results  for  the  noncentral  Wishart 
matrix. 


2. Strong Consistency. We begin by investigating the strong convergence
of W.

Lemma 2.1. Assume that $e_1, e_2, \ldots$ are i.i.d. with common mean vector
0 and common covariance matrix $\sigma^2 I_{p+r}$. Let

(2.1)    $\Theta = \sigma^2 I_{p+r} + \begin{pmatrix} I_p \\ B \end{pmatrix} A \begin{pmatrix} I_p \\ B \end{pmatrix}'$,

where $A$ is defined by (1.13), and is assumed to exist. Then

(2.2)    $\lim_{n \to \infty} n^{-1} W = \Theta$, a.s.

Proof. From (1.4) and (1.6),

(2.3)    $n^{-1} W = n^{-1} EE' + n^{-1} \Xi\Xi' + n^{-1} E\Xi' + n^{-1} \Xi E'$.

Since the $n$ columns of $E$ are i.i.d. with common mean vector 0 and common
covariance matrix $\sigma^2 I_{p+r}$, we have from the SLLN that

(2.4)    $\lim_{n \to \infty} n^{-1} EE' = \sigma^2 I_{p+r}$, a.s.

From (1.5) and (1.13),

(2.5)    $\lim_{n \to \infty} n^{-1} \Xi\Xi' = \begin{pmatrix} I_p \\ B \end{pmatrix} A \begin{pmatrix} I_p \\ B \end{pmatrix}'$.

Thus, (2.2) holds if

(2.6)    $\lim_{n \to \infty} n^{-1} E\Xi' = \lim_{n \to \infty} (n^{-1} \Xi E')' = 0$, a.s.

Let $\Xi = ((\xi_{ij}))$, $E = ((e_{ij}))$, and
$A^{(n)} = n^{-1} \Xi E' = ((a_{ij}^{(n)}))$. Finally, let
$h_{ik}^{(n)} = [\sum_{\ell=1}^{n} \xi_{i\ell}^2]^{-1/2} \xi_{ik}$. Then for
all $(i,j)$, $i, j = 1, 2, \ldots, p+r$,

    $a_{ij}^{(n)} = n^{-1} \sum_{k=1}^{n} \xi_{ik} e_{jk} = [n^{-1} \sum_{k=1}^{n} \xi_{ik}^2]^{1/2} \; n^{-1/2} \sum_{k=1}^{n} h_{ik}^{(n)} e_{jk}$.

By (2.5), $n^{-1} \sum_{k=1}^{n} \xi_{ik}^2$ converges to a finite
nonnegative number. But by Lemma 2 of Gleser (1966), noting that
$\sum_{k=1}^{n} (h_{ik}^{(n)})^2 = 1$, we have

    $\lim_{n \to \infty} n^{-1/2} \sum_{k=1}^{n} h_{ik}^{(n)} e_{jk} = 0$, a.s.

Thus for all $(i,j)$, $i, j = 1, 2, \ldots, p+r$,
$\lim_{n \to \infty} a_{ij}^{(n)} = 0$, a.s., proving (2.6), and thus (2.2). □
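The convergence in (2.2) can be illustrated by simulation. The sketch below is only an illustration under assumed inputs: the helper names `theta_limit` and `scaled_cross_product`, the cyclic fixed design (chosen so that $n^{-1}\Xi_1\Xi_1' = I_p$ exactly when p divides n), and the uniform error distribution are all choices made for this example. Uniform errors are used to emphasize that Lemma 2.1 does not require normality.

```python
import numpy as np

def theta_limit(B, A, sigma2):
    """Theta of (2.1): sigma^2 I_{p+r} + (I_p; B) A (I_p; B)'."""
    r, p = B.shape
    U = np.vstack([np.eye(p), B])
    return sigma2 * np.eye(p + r) + U @ A @ U.T

def scaled_cross_product(n, B, sigma2, rng):
    """Return n^{-1} W for a fixed design with n^{-1} Xi_1 Xi_1' = I_p (p must divide n)."""
    r, p = B.shape
    # Columns of Xi_1 cycle through sqrt(p) times the unit vectors e_1, ..., e_p.
    Xi1 = np.sqrt(p) * np.eye(p)[:, np.arange(n) % p]
    Xi = np.vstack([Xi1, B @ Xi1])
    half_width = np.sqrt(3.0 * sigma2)          # uniform(-a, a) has variance a^2 / 3
    E = rng.uniform(-half_width, half_width, size=(p + r, n))
    X = Xi + E
    return X @ X.T / n

rng = np.random.default_rng(1)
B = np.array([[1.0, -0.5]])
Theta = theta_limit(B, np.eye(2), sigma2=0.25)
gap = np.abs(scaled_cross_product(20000, B, 0.25, rng) - Theta).max()
```

Here `gap` is the largest elementwise discrepancy between $n^{-1}W$ and $\Theta$ for one simulated sample; it should shrink as n grows.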

Remark. If $e_1, e_2, \ldots$ have common covariance matrix $\Sigma$:
$(p+r) \times (p+r)$ and if $\lim_{n \to \infty} n^{-1} \Xi\Xi' = T$ exists,
then a proof identical to that of Lemma 2.1 can be used to demonstrate that
$\lim_{n \to \infty} n^{-1} W = \Sigma + T$, a.s.

Let $\gamma_1 \ge \gamma_2 \ge \cdots \ge \gamma_p \ge 0$ be the eigenvalues
of $(I_p + B'B)^{1/2} A (I_p + B'B)^{1/2}$, where $(I_p + B'B)^{1/2}$ is the
symmetric square root of $(I_p + B'B)$. Let
$D_\gamma = \operatorname{diag}(\gamma_1, \gamma_2, \ldots, \gamma_p)$ and let
$\Psi$ be a $p \times p$ orthogonal matrix satisfying

(2.7)    $(I_p + B'B)^{1/2} A (I_p + B'B)^{1/2} = \Psi D_\gamma \Psi'$.

Note that if

(2.8)    $\Gamma = \begin{pmatrix} (I_p + B'B)^{-1/2} \Psi & B'(I_r + BB')^{-1/2} \\ B (I_p + B'B)^{-1/2} \Psi & -(I_r + BB')^{-1/2} \end{pmatrix}$,

where $(I_r + BB')^{1/2}$ is the symmetric square root of $(I_r + BB')$, then
$\Gamma$: $(p+r) \times (p+r)$ is orthogonal, and

(2.9)    $\Theta = \Gamma \begin{pmatrix} \sigma^2 I_p + D_\gamma & 0 \\ 0 & \sigma^2 I_r \end{pmatrix} \Gamma'$.

We conclude that the columns of $\Gamma$ are eigenvectors of $\Theta$, and
that the eigenvalues
$\theta_1 \ge \theta_2 \ge \cdots \ge \theta_{p+r} > 0$ of $\Theta$ are:

(2.10)    $\theta_i = \sigma^2 + \gamma_i$,    $i = 1, 2, \ldots, p$,
          $\theta_{p+j} = \sigma^2$,    $j = 1, 2, \ldots, r$.
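The eigenvalue structure in (2.10) is easy to check numerically. The following sketch uses arbitrary illustrative values for $B$, $A$, and $\sigma^2$ (they are assumptions for this example only) and compares the eigenvalues of $\Theta$ from (2.1) with $\sigma^2 + \gamma_i$ and $\sigma^2$.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

sigma2 = 0.5
B = np.array([[1.0, -0.5], [0.3, 2.0]])      # r = p = 2, hypothetical values
A = np.array([[2.0, 0.4], [0.4, 1.0]])       # a positive definite A
r, p = B.shape

U = np.vstack([np.eye(p), B])
Theta = sigma2 * np.eye(p + r) + U @ A @ U.T                      # (2.1)

S = sym_sqrt(np.eye(p) + B.T @ B)
gamma = np.sort(np.linalg.eigvalsh(S @ A @ S))[::-1]              # gamma_1 >= ... >= gamma_p
theta = np.sort(np.linalg.eigvalsh(Theta))[::-1]                  # theta_1 >= ... >= theta_{p+r}
expected = np.concatenate([sigma2 + gamma, np.full(r, sigma2)])   # (2.10)
```

The arrays `theta` and `expected` agree up to floating-point error, and the r smallest eigenvalues of $\Theta$ all equal $\sigma^2$.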


Lemma 2.2. Under the conditions of Lemma 2.1,

(2.11)    $\lim_{n \to \infty} n^{-1} D = D_\theta = \operatorname{diag}(\theta_1, \theta_2, \ldots, \theta_{p+r})$, a.s.

Proof. Under our assumptions about the vectors $e_1, e_2, \ldots$, we know
that $n^{-1} W$ is positive definite for all $n \ge p+r$ [Perlman and Eaton
(1973)]. The $i$th eigenvalue of a positive definite matrix is a continuous
function of the elements of that matrix. Since $n^{-1} W$ a.s. converges to a
positive definite matrix $\Theta$ by Lemma 2.1, the result (2.11) immediately
follows. □


In the following argument, we will need to indicate notationally the
dependence of our sample quantities on the sample size $n$. Thus, for sample
size $n$, let $W^{(n)}$, $D^{(n)}$, and $G^{(n)}$ be the quantities defined
by (1.6), (1.7), and (1.9) respectively. Further, let $\hat{B}^{(n)}$ be the
estimator of $B$ for sample size $n$ given by (1.10).

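Lemma 2.3. Under the assumptions of Lemma 2.1, plus the additional
assumption that $A$ is positive definite, we have

(2.12)    $\lim_{n \to \infty} \hat{B}^{(n)} = B$, a.s.,

(2.13)    $\lim_{n \to \infty} n^{-1} (G_{22}^{(n)}) (D_{\min}^{(n)}) (G_{22}^{(n)})' = \sigma^2 (I_r + BB')^{-1}$, a.s.,

(2.14)    $\lim_{n \to \infty} n^{-1} (G_{11}^{(n)}) (D_{\max}^{(n)}) (G_{11}^{(n)})' = A + \sigma^2 (I_p + B'B)^{-1}$, a.s.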

Proof. Note that the columns of $G^{(n)}$ are orthogonal and of length 1 for
all $n \ge p+r$. Let $\omega$ be a fixed point in the underlying probability
space. For fixed $\omega$ such that (2.2) and (2.11) hold, the sequence
$\{G^{(n)}\}$ lies in a compact subspace of $(p+r)^2$-dimensional Euclidean
space. Thus, each subsequence of $\{G^{(n)}\}$ has a convergent
sub-subsequence. Suppose that the limit of this sub-subsequence is

    $Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}$.

Then since for all $n$,

    $(n^{-1} W^{(n)}) \begin{pmatrix} G_{11}^{(n)} \\ G_{21}^{(n)} \end{pmatrix} = \begin{pmatrix} G_{11}^{(n)} \\ G_{21}^{(n)} \end{pmatrix} (n^{-1} D_{\max}^{(n)})$,

we can take limits over the indices of the sub-subsequence on both sides of
this equality and obtain [see (2.2), (2.10) and (2.11)]

    $\Theta \begin{pmatrix} Q_{11} \\ Q_{21} \end{pmatrix} = \begin{pmatrix} Q_{11} \\ Q_{21} \end{pmatrix} (\sigma^2 I_p + D_\gamma)$.

Thus, $(Q_{11}', Q_{21}')'$ is in the eigensubspace corresponding to the
largest $p$ roots of $\Theta$. Since our additional assumption (that $A$ is
positive definite) implies that
$\theta_p = \sigma^2 + \gamma_p > \sigma^2 = \theta_{p+1}$, this
eigensubspace is unique. Hence, from (2.8) and (2.9) there exists a
nonsingular matrix $T$ such that

(2.15)    $\begin{pmatrix} Q_{11} \\ Q_{21} \end{pmatrix} = \begin{pmatrix} (I_p + B'B)^{-1/2} \Psi \\ B (I_p + B'B)^{-1/2} \Psi \end{pmatrix} T$.

Again, since

    $\hat{B}^{(n)} = G_{21}^{(n)} (G_{11}^{(n)})^{-1}$,

taking limits on both sides of this equality over the indices of the
sub-subsequence results, by (2.15), in the limiting value $B$. Thus, we have
shown that for every value $\omega$ such that (2.2) and (2.11) hold, every
subsequence of $\{\hat{B}^{(n)}\}$ has a sub-subsequence converging to $B$.
It then follows from facts about limits of sequences in Euclidean space that
$\lim_{n \to \infty} \hat{B}^{(n)} = B$ for all $\omega$ such that (2.2) and
(2.11) hold, and thus that (2.12) holds.

The results (2.13) and (2.14) follow by similar arguments using the
identities [see (1.9) and (1.10)]

(2.16)    $(B, -I_r)(n^{-1} W^{(n)})(B, -I_r)' = (\hat{B}^{(n)} - B)(G_{11}^{(n)})(n^{-1} D_{\max}^{(n)})(G_{11}^{(n)})'(\hat{B}^{(n)} - B)' + (I_r + B\hat{B}^{(n)\prime})(G_{22}^{(n)})(n^{-1} D_{\min}^{(n)})(G_{22}^{(n)})'(I_r + \hat{B}^{(n)} B')$

and

(2.17)    [not legible in the source]

respectively. □


From (1.11), (1.6), and (1.9), we see that

(2.18)    $\hat{\Xi}_1 \hat{\Xi}_1' = G_{11} D_{\max} G_{11}'$.

It thus follows from (2.14) that $\{n^{-1} \hat{\Xi}_1 \hat{\Xi}_1'\}$ is not
a consistent sequence of estimators for $A$. Since $A$ helps to determine the
covariance matrix of the asymptotic distribution of $n^{1/2}(\hat{B} - B)$,
we will need a consistent sequence of estimators for $A$ in order to
construct an approximate large-sample confidence region for $B$. The
following theorem, which follows directly from Lemmas 2.2 and 2.3, both
summarizes our strong consistency results for $\hat{B}$ and
$r^{-1}(p+r)\hat{\sigma}^2$, and provides us with a strongly consistent
sequence of estimators for $A$.
Theorem 2.1. Under the conditions of Lemma 2.1,

(2.19)    $\lim_{n \to \infty} r^{-1}(p+r)\hat{\sigma}^2 = \sigma^2$, a.s.,

so that $r^{-1}(p+r)\hat{\sigma}^2$ is a strongly consistent (sequence of)
estimator(s) for $\sigma^2$. Under the conditions of Lemma 2.3,

(2.20)    $\lim_{n \to \infty} \hat{B} = B$, a.s.,

and

(2.21)    $\lim_{n \to \infty} [n^{-1}(G_{11} D_{\max} G_{11}') - r^{-1}(p+r)\hat{\sigma}^2 (I_p + \hat{B}'\hat{B})^{-1}] = A$, a.s.,

so that $\hat{B}$ is a strongly consistent (sequence of) estimator(s) for
$B$, and

(2.22)    $\hat{A} = n^{-1}(G_{11} D_{\max} G_{11}') - r^{-1}(p+r)\hat{\sigma}^2 (I_p + \hat{B}'\hat{B})^{-1}$

is a strongly consistent (sequence of) estimator(s) for $A$.
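To make the contrast between the inconsistent estimator $n^{-1}\hat{\Xi}_1\hat{\Xi}_1'$ and a bias-corrected estimator of $A$ concrete, here is a minimal simulation sketch. The function name `consistent_estimates`, the cyclic design (for which $A = I_p$), and the parameter values are assumptions of this example, and the correction term follows the reading of (2.22) as $n^{-1}G_{11}D_{\max}G_{11}'$ minus $r^{-1}(p+r)\hat{\sigma}^2(I_p+\hat{B}'\hat{B})^{-1}$.

```python
import numpy as np

def consistent_estimates(X, p):
    """Return (B_hat, sigma2_tilde, A_hat, M), with M = n^{-1} G11 D_max G11'."""
    m, n = X.shape
    r = m - p
    d, G = np.linalg.eigh(X @ X.T)
    d, G = d[::-1], G[:, ::-1]                       # d_1 >= ... >= d_{p+r}
    G11, G21 = G[:p, :p], G[p:, :p]
    B_hat = G21 @ np.linalg.inv(G11)                 # (1.10)
    sigma2_tilde = d[p:].sum() / (n * r)             # r^{-1}(p+r) sigma_hat^2, see (2.19)
    M = G11 @ np.diag(d[:p]) @ G11.T / n             # equals n^{-1} Xi1_hat Xi1_hat' by (2.18)
    A_hat = M - sigma2_tilde * np.linalg.inv(np.eye(p) + B_hat.T @ B_hat)
    return B_hat, sigma2_tilde, A_hat, M

# Hypothetical design with n^{-1} Xi_1 Xi_1' = I_2 exactly, so A = I_2.
rng = np.random.default_rng(2)
n, p = 20000, 2
B = np.array([[1.0, -0.5]])
Xi1 = np.sqrt(p) * np.eye(p)[:, np.arange(n) % p]
X = np.vstack([Xi1, B @ Xi1]) + 0.5 * rng.normal(size=(3, n))   # sigma^2 = 0.25
B_hat, sigma2_tilde, A_hat, M = consistent_estimates(X, p)
```

In this setup `M` settles near $A + \sigma^2(I_p + B'B)^{-1}$ rather than $A$, as (2.14) predicts, while `A_hat` is close to the identity.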

Remark I. Weak consistency results (i.e., convergence in probability) for
$\hat{B}$ and $r^{-1}(p+r)\hat{\sigma}^2$ have been obtained previously by
Gleser and Watson (1973) when $r = p$, and by Bhargava (1975) in the general
case $r \le p$. Their proof of consistency for $r^{-1}(p+r)\hat{\sigma}^2$ is
given under slightly weaker conditions [$n^{-2} \cdots = o(1)$] than the
conditions of Lemma 2.1, but their proof of the consistency of $\hat{B}$
requires the condition (1.13), and also has a theoretical gap [noted in
Gleser and Watson (1973)]. The full strength of the almost sure convergence
results given in this section is not really needed for deriving the
large-sample distributional results of the next section. However, the methods
and conclusions in this section are of interest in their own right
(particularly Lemma 2.1 and the proof of Lemma 2.3), and Theorem 2.1 may be
of use in future work concerning the construction of asymptotically
consistent and efficient fixed-diameter sequential confidence regions [see
Gleser (1965)] and asymptotically optimal Bayesian sequential regional
estimators [see Gleser and Kunte (1976)] for $B$.

Remark  II.  We  once  again  call  attention  to  the  fact  that  no  argument  in  the 
present  section  requires  us  to  assume  that  the  common  distribution  of 
$e_1, e_2, \ldots$ is multivariate normal.

3. Asymptotic distributions. We begin by finding the large sample
distribution of $n^{-1/2}(W - \mathcal{E}(W))$. Let
$e' = (e_1, e_2, \ldots, e_{p+r})$ be a random vector having the same
distribution as $e_1, e_2, \ldots, e_n$ (the columns of $E$). We assume that
$\mathcal{E}(e_i^4) < \infty$, $i = 1, 2, \ldots, p+r$. Let

(3.1)    $\phi_{ijk\ell} = \mathcal{E}(e_i e_j e_k e_\ell)$,    $i, j, k, \ell = 0, 1, 2, \ldots, p+r$,

with the understanding that $e_0 \equiv 1$. Thus,
$\phi_{0iii} = \mathcal{E}(e_i^3)$ and so forth. Now,
let $e_{ki}$ and $\xi_{ki}$ denote the $i$th elements of the $k$th columns of
$E$ and $\Xi$, respectively, $i = 1, 2, \ldots, p+r$; $k = 1, 2, \ldots, n$.
Note from (2.3) that

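    $\mathcal{E}(W) = n(\sigma^2 I_{p+r} + n^{-1} \Xi\Xi')$,

and thus that

(3.2)    $n^{-1/2}(W - \mathcal{E}(W)) = n^{-1/2}(EE' - n\sigma^2 I_{p+r} + \Xi E' + E\Xi') = n^{-1/2} \sum_{k=1}^{n} Z_k$,

where $Z_k = ((z_{kij}))$,

(3.3)    $z_{kij} = e_{ki} e_{kj} - \sigma^2 \delta_{ij} + \xi_{ki} e_{kj} + e_{ki} \xi_{kj}$,

and $\delta_{ij}$ is the Kronecker delta. The matrices $Z_1, Z_2, \ldots$
are mutually statistically independent (but not identically distributed) with
$\mathcal{E}(Z_k) = 0$, $k = 1, 2, \ldots, n$, and

(3.4)    $\operatorname{cov}(z_{kij}, z_{ki'j'}) = \phi_{iji'j'} - \sigma^4 \delta_{ij}\delta_{i'j'} + \xi_{ki'}\phi_{0ijj'} + \xi_{kj'}\phi_{0iji'} + \xi_{ki}\phi_{0i'j'j} + \xi_{kj}\phi_{0i'j'i}$
             $+ \sigma^2(\xi_{kj}\xi_{kj'}\delta_{ii'} + \xi_{ki}\xi_{ki'}\delta_{jj'} + \xi_{kj}\xi_{ki'}\delta_{ij'} + \xi_{ki}\xi_{kj'}\delta_{i'j})$.

Let

(3.5)    $K^{(n)}_{(i,j),(i',j')} = n^{-1} \sum_{k=1}^{n} \operatorname{cov}(z_{kij}, z_{ki'j'})$.

Then for all $i, j, i', j' = 1, 2, \ldots, p+r$, we have

(3.6)    $\lim_{n \to \infty} K^{(n)}_{(i,j),(i',j')} = K_{(i,j),(i',j')} = \phi_{iji'j'} - \sigma^4 \delta_{ij}\delta_{i'j'} + \bar{\xi}_{i'}\phi_{0ijj'} + \bar{\xi}_{j'}\phi_{0iji'} + \bar{\xi}_{i}\phi_{0i'j'j} + \bar{\xi}_{j}\phi_{0i'j'i}$
             $+ \sigma^2(\tau_{jj'}\delta_{ii'} + \tau_{ii'}\delta_{jj'} + \tau_{ji'}\delta_{ij'} + \tau_{ij'}\delta_{i'j})$,

where the existence of

(3.7)    $\bar{\xi}_i = \lim_{n \to \infty} n^{-1} \sum_{k=1}^{n} \xi_{ki}$,    $\tau_{ij} = \lim_{n \to \infty} n^{-1} \sum_{k=1}^{n} \xi_{ki}\xi_{kj}$

is guaranteed by (1.13).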

Theorem 3.1. Under the assumptions that (1.13) exists (and is finite) and
that $\mathcal{E}(e_i^4) < \infty$, $i = 1, 2, \ldots, p+r$, the elements on
and below the diagonal (the subdiagonal elements) of
$n^{-1/2}(W - \mathcal{E}(W))$ have a limiting joint
$(p+r)(p+r+1)/2$-dimensional normal distribution with mean vector 0 and
covariance matrix $K = ((K_{(i,j),(i',j')}))$.

Proof. Let $W = ((w_{ij}))$. Consider any linear combination

(3.8)    $n^{-1/2} \sum_{i \le j} c_{ij}(w_{ij} - \mathcal{E}(w_{ij})) = n^{-1/2} \sum_{k=1}^{n} \sum_{i \le j} c_{ij} z_{kij}$

of the subdiagonal elements of $n^{-1/2}(W - \mathcal{E}(W))$. We recognize
this as a normalized sum of independent random variables. Using (3.3), (3.4),
(3.5) and the assumption that the fourth moments of $e$ exist, it is
straightforward to prove that

    $\operatorname{plim}_{n \to \infty} [\sum_{k=1}^{n} \operatorname{var}(\sum_{i \le j} c_{ij} z_{kij})]^{-1} \sum_{k=1}^{n} (\sum_{i \le j} c_{ij} z_{kij})^2 = 1$.

It then follows from Raikov's Theorem [Gnedenko and Kolmogorov (1954; p. 143)]
that

(3.9)    $n^{-1/2} \sum_{k=1}^{n} \sum_{i \le j} c_{ij} z_{kij} \to N(0, \sum_{i \le j} \sum_{i' \le j'} c_{ij} c_{i'j'} K_{(i,j),(i',j')})$ in law.

Since (3.9) holds for all linear combinations (3.8), the conclusion of the
theorem follows. □

Remark. Our implicit assumption that the covariance matrix of $e$ is
$\sigma^2 I_{p+r}$ is unnecessary for the proof of asymptotic normality. If
the covariance matrix of $e$ is $\Sigma = ((\sigma_{ij}))$, then the same
conclusion holds, except that $\Sigma$ replaces $\sigma^2 I_{p+r}$ in the
formula for $\mathcal{E}(W)$, and in the formula for $K_{(i,j),(i',j')}$ in
(3.6) we have

    $K_{(i,j),(i',j')} = \phi_{iji'j'} - \sigma_{ij}\sigma_{i'j'} + \bar{\xi}_{i'}\phi_{0ijj'} + \bar{\xi}_{j'}\phi_{0iji'} + \bar{\xi}_{i}\phi_{0i'j'j} + \bar{\xi}_{j}\phi_{0i'j'i}$
        $+ (\tau_{jj'}\sigma_{ii'} + \tau_{ii'}\sigma_{jj'} + \tau_{ji'}\sigma_{ij'} + \tau_{ij'}\sigma_{i'j})$.

Corollary 3.1. If $e_1, e_2, \ldots$ are i.i.d. multivariate normal with mean
vector 0 and covariance matrix $\Sigma$, and if (1.13) exists (and is
finite), then the subdiagonal elements of $n^{-1/2}(W - n\Sigma - \Xi\Xi')$
have a limiting joint $(p+r)(p+r+1)/2$-variate normal distribution with mean
vector 0 and covariance matrix $K = ((K_{(i,j),(i',j')}))$ given by

    $K_{(i,j),(i',j')} = \sigma_{ii'}\sigma_{jj'} + \sigma_{ij'}\sigma_{i'j} + \tau_{jj'}\sigma_{ii'} + \tau_{ii'}\sigma_{jj'} + \tau_{ji'}\sigma_{ij'} + \tau_{ij'}\sigma_{i'j}$.

When $\Sigma = \sigma^2 I_{p+r}$,

(3.10)    $K_{(i,j),(i',j')} = \sigma^4(\delta_{ii'}\delta_{jj'} + \delta_{ij'}\delta_{i'j}) + \sigma^2(\tau_{jj'}\delta_{ii'} + \tau_{ii'}\delta_{jj'} + \tau_{ji'}\delta_{ij'} + \tau_{ij'}\delta_{i'j})$.

Proof. Because $e_1, e_2, \ldots$ are i.i.d. $N(0, \Sigma)$, we have

    $\phi_{iji'j'} = 0$, if $i = 0$, $i' = 0$, $j = 0$, or $j' = 0$,
    $\phi_{iji'j'} = \sigma_{ii'}\sigma_{jj'} + \sigma_{ij'}\sigma_{i'j} + \sigma_{ij}\sigma_{i'j'}$, otherwise.

The result of the Corollary now is a direct consequence of Theorem 3.1. □

We note that Corollary 3.1 gives the asymptotic distribution of the
noncentral Wishart matrix in cases where the noncentrality parameter is
$O(n)$.

To find the asymptotic distribution of $n^{1/2}(\hat{B} - B)$, it is
sufficient to note that (1.9) and (1.10) yield the representation:

(3.11)    $(I_p, B')[n^{-1/2}(W - \mathcal{E}(W))](B, -I_r)' = -(I_p + B'\hat{B})(n^{-1} G_{11} D_{\max} G_{11}')\, n^{1/2}(\hat{B} - B)' + n^{1/2}(\hat{B} - B)'(n^{-1} G_{22} D_{\min} G_{22}')(I_r + \hat{B}B')$.

Assuming that $A$ is positive definite, and using (2.13), (2.14) and (3.11),
we conclude that $n^{1/2}(\hat{B} - B)'$ and

(3.12)    $F = -A^{-1}(I_p + B'B)^{-1}(I_p, B')[n^{-1/2}(W - \mathcal{E}(W))](B, -I_r)'$

have the same asymptotic distribution. Since the elements of $F$ are linear
combinations of the subdiagonal elements of $n^{-1/2}(W - \mathcal{E}(W))$,
we conclude that when the assumptions of Theorem 3.1 hold and $A$ is positive
definite, the elements of $n^{1/2}(\hat{B} - B)'$ have a limiting
$rp$-variate normal distribution with 0 mean vector and a covariance matrix
that can be calculated using (3.6) and (3.12). Since the covariance matrix of
the limiting distribution of $n^{1/2}(\hat{B} - B)$ under the general
conditions of Theorem 3.1 involves fourth-order cross moments of $e$, and
thus is both complicated and hard to estimate, and since we are primarily
interested in the case where $e_1, e_2, \ldots$ are i.i.d.
$N(0, \sigma^2 I_{p+r})$, we content ourselves with the following.

Theorem 3.2. If $e_1, e_2, \ldots$ are i.i.d. $N(0, \sigma^2 I_{p+r})$, and
if (1.13) exists and is positive definite, then the elements of
$n^{1/2}(\hat{B} - B)'$ have a limiting joint $rp$-variate normal
distribution with zero means and covariance between the $(i,j)$th and
$(i',j')$th elements given by:

(3.13)    $\sigma^2[\sigma^2(A^{-1}(I_p + B'B)^{-1}A^{-1})_{ii'} + (A^{-1})_{ii'}](I_r + BB')_{jj'}$.

Proof. The asymptotic normality follows from the preceding arguments. The
formula (3.13) may be obtained from (3.10), (3.12), and straightforward
calculation. In the computation, it is helpful to note that if
$T = ((\tau_{ij}))$ is defined by (3.7), then

(3.14)    $T = \begin{pmatrix} I_p \\ B \end{pmatrix} A \begin{pmatrix} I_p \\ B \end{pmatrix}'$.    □
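The "straightforward calculation" behind (3.13) can be sketched as follows. The shorthand $U$, $V$, $Z$, and $M$ is introduced only for this sketch and is not notation from the paper.

```latex
% Shorthand (assumed for this sketch):
%   U = (I_p, B')',  V = (B, -I_r)',  Z = n^{-1/2}(W - \mathcal{E}(W)),  M = U' Z V.
U'V = B' - B' = 0, \qquad T = U A U' \ \text{by (3.14)}, \qquad TV = U A (U'V) = 0 .

% Apply (3.10) to the elements of M = U'ZV; every term containing U'V or TV vanishes:
\operatorname{cov}(M_{ij}, M_{i'j'})
  = \bigl[ \sigma^4 (U'U)_{ii'} + \sigma^2 (U'TU)_{ii'} \bigr] (V'V)_{jj'} ,
\qquad
U'U = I_p + B'B, \quad U'TU = (I_p + B'B) A (I_p + B'B), \quad V'V = I_r + BB' .

% Since F = -A^{-1}(I_p + B'B)^{-1} M by (3.12):
\operatorname{cov}(F_{ij}, F_{i'j'})
  = \sigma^2 \bigl[ \sigma^2 \bigl( A^{-1}(I_p + B'B)^{-1}A^{-1} \bigr)_{ii'} + (A^{-1})_{ii'} \bigr] (I_r + BB')_{jj'} .
```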


We note that from (2.14) and (2.22),

    $\lim_{n \to \infty} \hat{A}^{-1}(n^{-1} G_{11} D_{\max} G_{11}')\hat{A}^{-1} = A^{-1}(A + \sigma^2(I_p + B'B)^{-1})A^{-1}$, a.s.,

and from (2.12),

    $\lim_{n \to \infty} (I_r + \hat{B}\hat{B}') = (I_r + BB')$.

It then follows from Theorem 3.2 that an asymptotic $100(1-\alpha)\%$
elliptical confidence region for $B$ is:

(3.15)    $\{B: \operatorname{tr}[n(I_r + \hat{B}\hat{B}')^{-1}(\hat{B} - B)\hat{A}(n^{-1} G_{11} D_{\max} G_{11}')^{-1}\hat{A}(\hat{B} - B)'] \le r^{-1}(p+r)\hat{\sigma}^2 \chi^2_{rp}[1-\alpha]\}$,

where $\chi^2_{rp}[1-\alpha]$ is the $100(1-\alpha)$th percentile of the
$\chi^2_{rp}$ distribution.
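A membership test for a region of the form (3.15) could be coded along the following lines. This is a sketch under stated assumptions: the function names `region_statistic` and `in_region` are hypothetical, the estimator of $A$ is taken to be $n^{-1}G_{11}D_{\max}G_{11}' - r^{-1}(p+r)\hat{\sigma}^2(I_p+\hat{B}'\hat{B})^{-1}$, and the chi-square percentile is supplied by the caller (for example 5.991 for $rp = 2$ degrees of freedom at the 95% level) rather than computed.

```python
import numpy as np

def region_statistic(B0, X, p):
    """Trace in (3.15) at candidate B0, divided by r^{-1}(p+r) sigma_hat^2."""
    m, n = X.shape
    r = m - p
    d, G = np.linalg.eigh(X @ X.T)
    d, G = d[::-1], G[:, ::-1]                        # d_1 >= ... >= d_{p+r}
    G11, G21 = G[:p, :p], G[p:, :p]
    B_hat = G21 @ np.linalg.inv(G11)                  # (1.10)
    sigma2_tilde = d[p:].sum() / (n * r)              # r^{-1}(p+r) sigma_hat^2
    M = G11 @ np.diag(d[:p]) @ G11.T / n              # n^{-1} G11 D_max G11'
    A_hat = M - sigma2_tilde * np.linalg.inv(np.eye(p) + B_hat.T @ B_hat)
    middle = A_hat @ np.linalg.inv(M) @ A_hat
    diff = B_hat - B0
    left = n * np.trace(np.linalg.inv(np.eye(r) + B_hat @ B_hat.T) @ diff @ middle @ diff.T)
    return left / sigma2_tilde

def in_region(B0, X, p, chi2_quantile):
    """True if B0 lies in the approximate confidence region (3.15)."""
    return bool(region_statistic(B0, X, p) <= chi2_quantile)

# Hypothetical simulated data: p = 2, r = 1, so rp = 2 degrees of freedom.
rng = np.random.default_rng(3)
B = np.array([[1.0, -0.5]])
Xi1 = rng.normal(size=(2, 2000))
X = np.vstack([Xi1, B @ Xi1]) + 0.1 * rng.normal(size=(3, 2000))
stat_true = region_statistic(B, X, p=2)
far_inside = in_region(B + 1.0, X, p=2, chi2_quantile=5.991)
```

Under the assumptions of Theorem 3.2, `stat_true` should behave approximately like a chi-square variate with $rp$ degrees of freedom, while a candidate far from the true $B$ falls well outside the region.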

Turning next to the question of the asymptotic distribution of
$n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2)$, we note from (2.16), Lemma
2.3, and Theorem 3.1 that

    $n^{-1} D_{\min} = G_{22}^{-1}(I_r + B\hat{B}')^{-1}(B, -I_r)(n^{-1} W)(B, -I_r)'(I_r + \hat{B}B')^{-1}(G_{22}')^{-1} + o_p(n^{-1/2})$.

Since it also follows directly from Lemma 2.3 that

    $\lim_{n \to \infty} (I_r + \hat{B}B')^{-1}(G_{22}')^{-1}(G_{22})^{-1}(I_r + B\hat{B}')^{-1} = (I_r + BB')^{-1}$, a.s.,

we conclude that

    $n^{-1} \operatorname{tr} D_{\min} = \operatorname{tr}\{(I_r + BB')^{-1/2}(B, -I_r)(n^{-1} W)(B, -I_r)'(I_r + BB')^{-1/2}\} + o_p(n^{-1/2})$,

or that

    $n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2) = r^{-1} \operatorname{tr}\{(I_r + BB')^{-1/2}(B, -I_r)[n^{-1/2}(W - \mathcal{E}(W))](B, -I_r)'(I_r + BB')^{-1/2}\} + o_p(1)$.

It now follows directly from Theorem 3.1 that the limiting distribution of
$n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2)$ is univariate normal with
zero mean, and a variance involving $B$ and the fourth-order moments of $e$.
[Note: To obtain this result we need not only the assumptions of Theorem 3.1,
but also the assumption that $A$ is positive definite.] In the case when the
$e_i$'s are i.i.d. $N(0, \sigma^2 I_{p+r})$, the variance of the asymptotic
distribution greatly simplifies, and we obtain the result:

Theorem 3.3. Under the assumptions of Theorem 3.2,

(3.17)    $n^{1/2}(r^{-1}(p+r)\hat{\sigma}^2 - \sigma^2) \to N(0, 2\sigma^4 r^{-1})$ in law.

Thus, an approximate $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is

(3.18)    $\{\sigma^2: |\sigma^2 - (nr)^{-1} \operatorname{tr} D_{\min}| \le (2\chi^2_1[1-\alpha]/rn)^{1/2}(nr)^{-1} \operatorname{tr} D_{\min}\}$.
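The interval (3.18) is simple to compute once the eigenvalues of $W$ are available. The sketch below is an illustration only: the function name `sigma2_confidence_interval` is hypothetical, and the chi-square percentile with one degree of freedom is passed in by the caller (the default 3.841 corresponds approximately to the 95% level).

```python
import numpy as np

def sigma2_confidence_interval(X, p, chi2_1_quantile=3.841):
    """Approximate interval (3.18) for sigma^2; X is (p+r) x n."""
    m, n = X.shape
    r = m - p
    d = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]            # d_1 >= ... >= d_{p+r}
    center = d[p:].sum() / (n * r)                             # (nr)^{-1} tr D_min
    half_width = np.sqrt(2.0 * chi2_1_quantile / (r * n)) * center
    return center - half_width, center + half_width

# Hypothetical simulated data with sigma^2 = 0.01.
rng = np.random.default_rng(4)
B = np.array([[1.0, -0.5]])
Xi1 = rng.normal(size=(2, 2000))
X = np.vstack([Xi1, B @ Xi1]) + 0.1 * rng.normal(size=(3, 2000))
lo95, hi95 = sigma2_confidence_interval(X, p=2)
lo_wide, hi_wide = sigma2_confidence_interval(X, p=2, chi2_1_quantile=25.0)
```

The interval is centered at $(nr)^{-1}\operatorname{tr} D_{\min}$, and its half-width is proportional to that center, which reflects the asymptotic variance $2\sigma^4 r^{-1}$ in (3.17).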

Remark.  The  methods  of  proof  used  in  this  section  differ  from  those 
usually  used  to  prove  asymptotic  normality  of  principal  components  [see 
Anderson  (1963)]  or  of  factor  loadings  [see  Anderson  and  Rubin  (1956)]. 

There  is,  of  course,  considerable  resemblance  between  the  model  (1.4)  used 
in  this  paper,  and  the  kinds  of  estimators  derived,  and  the  models  and 
estimators of principal component analysis and of factor analysis. Indeed,
a first step in computing $\hat{B}$ and $\hat{\sigma}^2$ is to obtain a
principal components breakdown of the cross-product matrix $W$; but we must
note that in our model, $W$ is noncentral Wishart with covariance matrix
parameter $\sigma^2 I_{p+r}$, while principal components analysis deals with
a central Wishart matrix with a general covariance matrix $\Sigma$. The
analogy of our model to factor analysis
with  fixed  factor  values  [see  Anderson  and  Rubin  (1956)  and  Lawley  (1953)] 
is  much  closer,  although  our  model  makes  very  restrictive  assumptions  about 
the  form  of  the  factor  loadings  and  error  covariance  matrix.  Even  though 
it  is  probably  possible  to  obtain  our  large  sample  results  by  specializing 
the  more  general  results  of  Anderson  and  Rubin  (1956) , our  approach  in  this 
section  has  the  advantage  of  directness.  Further,  the  representations  which 
we  have  used  may  yield  information  about  the  accuracy  of  our  large  sample 
approximations  in  finite  samples. 